A Minimum Description Length Proposal for Lossy Data Compression

Authors

  • M. Madiman
  • M. Harrison
  • I. Kontoyiannis
Abstract

We give a development of the theory of lossy data compression from the point of view of statistics. This is partly motivated by the enormous success of the statistical approach in lossless compression, in particular Rissanen’s celebrated Minimum Description Length (MDL) principle. A precise characterization of the fundamental limits of compression performance is given, for arbitrary data sources and with respect to general distortion measures. The starting point for this development is the observation that there is a precise correspondence between compression algorithms and probability distributions (in analogy with the Kraft inequality in lossless compression). This leads us to formulate a version of the MDL principle for lossy data compression. We discuss the consequences of the lossy MDL principle and explain how it suggests practical lessons for vector-quantizer design. We introduce two methods for selecting efficient compression algorithms, the lossy Maximum Likelihood Estimate (LMLE) and the lossy Minimum Description Length Estimate (LMDLE). We describe their theoretical performance and give examples illustrating how the LMDLE has superior performance to the LMLE.
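The lossless correspondence the abstract alludes to can be made concrete: a probability distribution Q over symbols induces idealized codelengths L(x) = ⌈−log₂ Q(x)⌉, and those lengths automatically satisfy the Kraft inequality, so a prefix-free code with those lengths exists. The sketch below illustrates this lossless analogy only; the paper's lossy construction is more involved.

```python
import math

# Illustrative sketch of the Kraft correspondence (lossless case only):
# a distribution Q over symbols induces idealized codelengths
# L(x) = ceil(-log2 Q(x)), and these lengths satisfy Kraft's inequality,
# guaranteeing that a prefix-free code with those lengths exists.
Q = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

lengths = {x: math.ceil(-math.log2(p)) for x, p in Q.items()}
kraft_sum = sum(2.0 ** -l for l in lengths.values())

assert kraft_sum <= 1.0  # Kraft inequality holds
print(lengths)    # {'a': 1, 'b': 2, 'c': 3, 'd': 3}
print(kraft_sum)  # 1.0 for this dyadic distribution
```

For a dyadic distribution like this one the inequality is tight; for non-dyadic probabilities the ceiling wastes at most one bit per symbol.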


Similar Articles

Model Selection via Rate-Distortion Theory

Rissanen’s Minimum Description Length (MDL) principle for model selection proposes that, among a predetermined collection of models, we choose the one which assigns the shortest description to the data at hand. In this context, a “description” is a lossless representation of the data that also takes into account the cost of describing the chosen model itself. We examine how the MDL principle mi...
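The two-part form of the MDL principle described above can be sketched in a few lines: the total description length is L(model) + L(data | model), and we pick the model minimizing the sum. The toy comparison below (a hypothetical example, not from the article) weighs a fair-coin model against a fitted Bernoulli model whose parameter costs roughly (1/2)·log₂ n bits to describe.

```python
import math

# Two-part MDL model selection on a toy binary sequence (illustrative
# sketch, not the article's method): total codelength is the cost of
# describing the model plus the cost of the data given the model.
data = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]
n = len(data)
k = sum(data)

# Model 0: fair coin -- no parameters to describe, 1 bit per symbol.
len_fair = float(n)

# Model 1: Bernoulli(p) with p fit by maximum likelihood; the fitted
# parameter itself costs about (1/2) log2 n bits to describe.
p = k / n
len_data = -(k * math.log2(p) + (n - k) * math.log2(1 - p))
len_bern = len_data + 0.5 * math.log2(n)

best = "Bernoulli" if len_bern < len_fair else "fair coin"
print(best)  # Bernoulli: the shorter total description wins
```

The parameter penalty is what keeps MDL from always preferring the richer model; on a balanced sequence the fair coin would win.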


Graph Compression

Graphs form the foundation of many real-world datasets. At the same time, the size of graphs presents a major obstacle to understanding the essential information they contain. In this report, I mainly review the framework in article [1] for compressing large graphs. It can be used to improve visualization, to understand the high-level structure of the graph, or as a pre-processing step for other dat...


Second-order properties of lossy likelihoods and the MLE/MDL dichotomy in lossy compression

This paper develops a theoretical framework for lossy source coding that treats it as a statistical problem, in analogy to the approach to universal lossless coding suggested by Rissanen’s Minimum Description Length (MDL) principle. Two methods for selecting efficient compression algorithms are proposed, based on lossy variants of the Maximum Likelihood and MDL principles. Their theoretical per...


Wavelet Thresholding Approach for Image Denoising

Images corrupted by Gaussian noise present a long-established problem in signal and image processing. Removing this noise by wavelet thresholding, which focuses on statistical modelling of the wavelet coefficients and the optimal choice of thresholds, is known as image denoising. For the first part, the threshold is derived in a Bayesian technique using a probabilistic model of the image wavelet ...


Adaptive wavelet thresholding for image denoising and compression

The first part of this paper proposes an adaptive, data-driven threshold for image denoising via wavelet soft-thresholding. The threshold is derived in a Bayesian framework, and the prior used on the wavelet coefficients is the generalized Gaussian distribution (GGD) widely used in image processing applications. The proposed threshold is simple and closed-form, and it is adaptive to each subban...
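The soft-thresholding step that this line of work builds on is simple to state: each wavelet coefficient is shrunk toward zero by the threshold T, and coefficients below T in magnitude are zeroed. The sketch below shows only this generic shrinkage rule with a fixed T; the paper's contribution is the Bayesian, subband-adaptive choice of T, which is not reproduced here.

```python
import math

# Generic wavelet soft-thresholding rule (illustrative sketch with a
# fixed threshold T; the paper derives T adaptively per subband).
def soft_threshold(coeffs, T):
    out = []
    for c in coeffs:
        m = abs(c) - T
        # Coefficients below T in magnitude are set to zero;
        # the rest are shrunk toward zero by T, keeping their sign.
        out.append(0.0 if m <= 0 else math.copysign(m, c))
    return out

print(soft_threshold([4.0, -0.3, 1.5, -2.5, 0.1], 1.0))
# [3.0, 0.0, 0.5, -1.5, 0.0]
```

In a full denoising pipeline this rule is applied to the detail coefficients of a wavelet transform, after which the image is reconstructed by the inverse transform.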




Publication date: 2004